** Data reading and variable selection from raw data
** Japan 2009 National Survey on Family and Economic Conditions (NSFEC) 


** 01. Reading data **

cap log close
clear all
set more off
cd /*insert your work directory here*/
use /*read your data here*/ 


** 02. Constructing year and country variables **

ge year=2009
lab var year "survey year"

ge country=392
lab var country "ISO country code"
//Japan: 392 (see "ISO Country Codes.pdf")


** 03. ID variables **

ge pid=UNIQ_ID
lab var pid "person id"
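
// Optional check (a sketch, not part of the original workflow): confirm that
// pid uniquely identifies observations. -capture noisily- prints the message
// if the check fails but lets the do-file continue.
capture noisily isid pid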


** 04. Basic Demographics (Sex and Age/birth year) **

ge sex=Q1A
lab var sex "sex"
lab def sex 1 "male" 2 "female"
lab val sex sex

ge age=Q1B
lab var age "age"

ge birthyr=year-age
lab var birthyr "year of birth"
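
// Optional check (a sketch, not part of the original workflow): inspect the
// range of age and of the implied birth year. Because birthyr is derived as
// year - age, it can be one year too late for respondents who had not yet
// had their birthday at the time of the interview.
tabstat age birthyr, statistics(n min max)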


** 05. Siblings **

rename Q1FO OLDBRO
rename Q1FY YNGBRO
rename Q1GO OLDSIS
rename Q1GY YNGSIS

ge nbro=0 if OLDBRO==1 & YNGBRO==1
replace nbro=1 if OLDBRO==2 & YNGBRO==1
replace nbro=1 if OLDBRO==1 & YNGBRO==2
replace nbro=2 if OLDBRO==2 & YNGBRO==2
replace nbro=2 if OLDBRO==3 | YNGBRO==3
lab var nbro "number of brothers (in category)"
lab def nbro 0 "no brother" 1 "1 brother" 2 "2 and more brothers"
lab val nbro nbro

ge nsis=0 if OLDSIS==1 & YNGSIS==1
replace nsis=1 if OLDSIS==2 & YNGSIS==1
replace nsis=1 if OLDSIS==1 & YNGSIS==2
replace nsis=2 if OLDSIS==2 & YNGSIS==2
replace nsis=2 if OLDSIS==3 | YNGSIS==3
lab var nsis "number of sisters (in category)"
lab def nsis 0 "no sister" 1 "1 sister" 2 "2 and more sisters"
lab val nsis nsis

ge nsibs=0 if nbro==0 & nsis==0
replace nsibs=1 if nbro==1 & nsis==0
replace nsibs=1 if nbro==0 & nsis==1
// one brother plus one sister also counts as 2 siblings
replace nsibs=2 if nbro==1 & nsis==1
replace nsibs=2 if nbro==2 | nsis==2
lab var nsibs "number of siblings (in category)"
lab def nsibs 0 "no sibling" 1 "1 sibling" 2 "2 and more siblings"
lab val nsibs nsibs

ge birthorder=1 if OLDBRO==1 & OLDSIS==1
replace birthorder=2 if OLDBRO==2 & OLDSIS==1
replace birthorder=2 if OLDBRO==1 & OLDSIS==2
// two or more older siblings in total: third or later born
replace birthorder=3 if (OLDBRO==2 & OLDSIS==2) | OLDBRO==3 | OLDSIS==3
lab var birthorder "birth order"
lab def birthorder 1 "1" 2 "2" 3 "3 and later"
lab val birthorder birthorder
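
// Optional check (a sketch, not part of the original workflow): cross-tabulate
// the derived sibling variables, including missing values, to see whether every
// answer combination has been assigned a category. This assumes the raw items
// are coded 1 = none, 2 = one, 3 = two or more, as the recodes above imply.
tab nbro nsis, missing
tab nsibs, missing
tab OLDBRO birthorder, missing
tab OLDSIS birthorder, missing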


** 06. Own education **

rename Q1D educ
lab var educ "highest education obtained"


** 07. Parents' education: Father and/or Mother **

rename Q6PA faeduc
lab var faeduc "father's highest education obtained"

rename Q6MA moeduc
lab var moeduc "mother's highest education obtained"


** 08. Own occupation **

rename Q16B occ
lab var occ "respondent's occupation"


** 09. Parents' occupation **

// Not available


** 10. Tabulate the Identified Variables **

log using /*insert your work directory here*/, replace text

** Data reading and variable selection from raw data
** Japan 2009 National Survey on Family and Economic Conditions (NSFEC) 

** Sex **
tab sex

** Age, Birth Year **
sum age birthyr, d

** Siblings **
sum nsibs nbro nsis birthorder, d

** R's Own Education **
tab1 educ 

** Parental Education **
tab1 faeduc moeduc

** R's Own Occupation **
tab1 occ

** Parental Occupation **
// Not available

log close

** 11. Keep the identified variables only **

keep year country pid sex age birthyr ///
	 nbro nsis nsibs birthorder ///
	 educ faeduc moeduc occ 


** 12. Save the Data File **

saveold /*insert your work directory here*/, replace



** 13. Homogenising education **
** Own Education **
rename educ educ_cat

ge educ_yrs=9 if educ_cat==1
replace educ_yrs=12 if educ_cat==2
replace educ_yrs=14 if educ_cat==3
replace educ_yrs=14 if educ_cat==4
replace educ_yrs=16 if educ_cat==5
replace educ_yrs=4 if educ_cat==6
lab var educ_yrs "respondent highest education in years"

ge educ_ISCED=200 if educ_cat==1
replace educ_ISCED=300 if educ_cat==2
replace educ_ISCED=500 if educ_cat==3
replace educ_ISCED=500 if educ_cat==4
replace educ_ISCED=600 if educ_cat==5
replace educ_ISCED=. if educ_cat==6
lab var educ_ISCED "respondent highest education in ISCED code"
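
// Optional check (a sketch, not part of the original workflow): cross-tabulate
// the original categories against the recoded variables, including missing
// values, to verify that each category maps to the intended years and ISCED code.
tab educ_cat educ_yrs, missing
tab educ_cat educ_ISCED, missing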

** Parents Education **
// flag=1: the father's education variable refers to the father himself
ge faeduc_flag=1 

rename faeduc faeduc_cat
rename moeduc maeduc_cat

ge faeduc_yrs=9 if faeduc_cat==1
replace faeduc_yrs=12 if faeduc_cat==2
replace faeduc_yrs=14 if faeduc_cat==3
replace faeduc_yrs=14 if faeduc_cat==4
replace faeduc_yrs=16 if faeduc_cat==5
replace faeduc_yrs=4 if faeduc_cat==6
replace faeduc_yrs=4 if faeduc_cat==7
lab var faeduc_yrs "father's education in years"

ge maeduc_yrs=9 if maeduc_cat==1
replace maeduc_yrs=12 if maeduc_cat==2
replace maeduc_yrs=14 if maeduc_cat==3
replace maeduc_yrs=14 if maeduc_cat==4
replace maeduc_yrs=16 if maeduc_cat==5
replace maeduc_yrs=4 if maeduc_cat==6
replace maeduc_yrs=4 if maeduc_cat==7
lab var maeduc_yrs "mother's education in years"

ge faeduc_ISCED=200 if faeduc_cat==1
replace faeduc_ISCED=300 if faeduc_cat==2
replace faeduc_ISCED=500 if faeduc_cat==3
replace faeduc_ISCED=500 if faeduc_cat==4
replace faeduc_ISCED=600 if faeduc_cat==5
replace faeduc_ISCED=. if faeduc_cat==6
lab var faeduc_ISCED "father highest education in ISCED code"

ge maeduc_ISCED=200 if maeduc_cat==1
replace maeduc_ISCED=300 if maeduc_cat==2
replace maeduc_ISCED=500 if maeduc_cat==3
replace maeduc_ISCED=500 if maeduc_cat==4
replace maeduc_ISCED=600 if maeduc_cat==5
replace maeduc_ISCED=. if maeduc_cat==6
lab var maeduc_ISCED "mother highest education in ISCED code"
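
// Optional check (a sketch, not part of the original workflow): the same
// cross-tabulation for both parents, looping over the variable prefixes
// fa (father) and ma (mother) used above.
foreach p in fa ma {
	tab `p'educ_cat `p'educ_yrs, missing
	tab `p'educ_cat `p'educ_ISCED, missing
}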


** 14. Homogenising sibling **
// cutoff (top code) of the sibling counts: 2 means "2 and more"
ge nbro_flag=2
lab var nbro_flag "cutoff of number of brothers"
ge nsis_flag=2
lab var nsis_flag "cutoff of number of sisters"
ge nsibs_flag=2
lab var nsibs_flag "cutoff of total number of siblings"


** 15. Tab Education and Sibling Variables **
tab1 sex age birthyr
tab1 educ_cat educ_yrs faeduc_cat faeduc_yrs maeduc_cat maeduc_yrs faeduc_flag 
tab1 nbro nsis nsibs nbro_flag nsis_flag nsibs_flag


** 16. Save the Data File **

saveold /*insert your work directory here*/, replace

